sky survey
Galaxy image simplification using Generative AI
Erukude, Sai Teja, Shamir, Lior
Modern digital sky surveys have been acquiring images of billions of galaxies. While these images often provide sufficient details to analyze the shape of the galaxies, accurate analysis of such high volumes of images requires effective automation. Current solutions often rely on machine learning annotation of the galaxy images based on a set of pre-defined classes. Here we introduce a new approach to galaxy image analysis that is based on generative AI. The method simplifies the galaxy images and automatically converts them into a "skeletonized" form. The simplified images allow accurate measurements of the galaxy shapes and analysis that is not limited to a certain pre-defined set of classes. We demonstrate the method by applying it to galaxy images acquired by the DESI Legacy Survey. The code and data are publicly available. The method was applied to 125,000 DESI Legacy Survey images, and the catalog of the simplified images is publicly available.
Estimating Probability Densities with Transformer and Denoising Diffusion
Leung, Henry W., Bovy, Jo, Speagle, Joshua S.
Transformers are often the go-to architecture to build foundation models that ingest a large amount of training data. But these models do not estimate the probability density distribution when trained on regression problems, yet obtaining full probabilistic outputs is crucial to many fields of science, where the probability distribution of the answer can be non-Gaussian and multimodal. In this work, we demonstrate that training a probabilistic model using a denoising diffusion head on top of the Transformer provides reasonable probability density estimation even for high-dimensional inputs. The combined Transformer+Denoising Diffusion model allows conditioning the output probability density on arbitrary combinations of inputs and it is thus a highly flexible density function emulator of all possible input/output combinations. We illustrate our Transformer+Denoising Diffusion model by training it on a large dataset of astronomical observations and measured labels of stars within our Galaxy and we apply it to a variety of inference tasks to show that the model can infer labels accurately with reasonable distributions.
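The abstract does not spell out the diffusion formulation, so the following is only an illustrative sketch of the standard DDPM-style forward noising step that a denoising head is typically trained to invert. It is not the authors' implementation. The function names, the linear schedule, its endpoint values, and the example labels are all assumptions made for this sketch.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return noise variances beta_1..beta_T, linearly spaced (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s<=t} (1 - beta_s) for every step t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form.

    Returns (x_t, eps); eps is the Gaussian noise a denoising head
    would be trained to predict from x_t, t, and the conditioning input.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return x_t, eps


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
labels = [0.5, -1.2]  # hypothetical normalized stellar labels
x_t, eps = add_noise(labels, 999, alpha_bars, rng)
```

At the last step almost all of the original signal has been replaced by noise, which is why sampling can start from a pure Gaussian draw and denoise step by step, conditioned on whatever the Transformer has encoded.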
Applications of AI in Astronomy
Djorgovski, S. G., Mahabal, A. A., Graham, M. J., Polsterer, K., Krone-Martins, A.
We provide a brief, and inevitably incomplete, overview of the use of Machine Learning (ML) and other AI methods in astronomy, astrophysics, and cosmology. Astronomy entered the big data era with the first digital sky surveys in the early 1990s and the resulting Terascale data sets, which required the automation of many data processing and analysis tasks, for example star-galaxy separation, with billions of feature vectors in hundreds of dimensions. The exponential data growth continued with the rise of synoptic sky surveys and Time Domain Astronomy, bringing Petascale data streams and the need for real-time processing, classification, and decision making. A broad variety of classification and clustering methods have been applied to these tasks, and this remains a very active area of research. Over the past decade we have seen exponential growth of the astronomical literature involving a variety of ML/AI applications of ever-increasing complexity and sophistication. ML and AI are now a standard part of the astronomical toolkit. As data complexity continues to increase, we anticipate further advances leading towards collaborative human-AI discovery.
Virtual Observatories, Data Mining, and Astroinformatics
The historical, current, and future trends in knowledge discovery from data in astronomy are presented here. The story begins with a brief history of data gathering and data organization. A description of the development of new information science technologies for astronomical discovery is then presented. Among these are e-Science and the virtual observatory, with its data discovery, access, display, and integration protocols; astroinformatics and data mining for exploratory data analysis, information extraction, and knowledge discovery from distributed data collections; new sky surveys' databases, including rich multivariate observational parameter sets for large numbers of objects; and the emerging discipline of data-oriented astronomical research, called astroinformatics. Astroinformatics is described as the fourth paradigm of astronomical research, following the three traditional research methodologies: observation, theory, and computation/modeling. Astroinformatics research areas include machine learning, data mining, visualization, statistics, semantic science, and scientific data management. Each of these areas is now an active research discipline, with significant science-enabling applications in astronomy. Research challenges and sample research scenarios are presented in these areas, in addition to sample algorithms for data-oriented research. These information science technologies enable scientific knowledge discovery from the increasingly large and complex data collections in astronomy. The education and training of the modern astronomy student must consequently include skill development in these areas, whose practitioners have traditionally been limited to applied mathematicians, computer scientists, and statisticians. Modern astronomical researchers must cross these traditional discipline boundaries, thereby borrowing the best of breed methodologies from multiple disciplines.
In the era of large sky surveys and numerous large telescopes, the potential for astronomical discovery is equally large, and so the data-oriented research methods, algorithms, and techniques that are presented here will enable the greatest discovery potential from the ever-growing data and information resources in astronomy.
Self-supervised machine learning adds depth, breadth and speed to sky surveys
Sky surveys are invaluable for exploring the universe, allowing celestial objects to be catalogued and analyzed without the need for lengthy observations. But in providing a general map or image of a region of the sky, they are also one of the largest data generators in science, currently imaging tens of millions to billions of galaxies over the lifetime of an individual survey. In the near future, for example, the Vera C. Rubin Observatory in Chile will produce 20 TB of data per night, generate about 10 million alerts daily, and end with a final data set of 60 PB in size. As a result, sky surveys have become increasingly labor-intensive when it comes to sifting through the gathered datasets to find the most relevant information or new discovery. In recent years machine learning has added a welcome twist to the process, primarily in the form of supervised and unsupervised algorithms used to train the computer models that mine the data.
Speedy robots gather spectra for sky surveys
It was one of the stranger and more monotonous jobs in astronomy: plugging optical fibers into hundreds of holes in aluminum plates. Every day, technicians with the Sloan Digital Sky Survey (SDSS) prepped up to 10 plates that would be placed that night at the focus of the survey's telescopes in Chile and New Mexico. The holes matched the exact positions of stars, galaxies, or other bright objects in the telescopes' view. Light from each object fell directly on a fiber and was whisked off to a spectrograph, which split the light into its component wavelengths, revealing key details such as what the object is made of and how it is moving. Now, after 20 years, the SDSS is going robotic. For the project's upcoming fifth set of surveys, known as the SDSS-V, plug plates are being replaced by 500 tiny robot arms, each holding fiber tips that patrol a small area of the telescope's focal plane. They can be reconfigured for a new sky map in 2 minutes. Other sky surveys are also adopting the speedy robots. They will not only save valuable observation time, but also allow the surveys to keep up with Europe's Gaia satellite, the upcoming Vera C. Rubin Observatory in Chile, and other efforts that produce huge catalogs of objects needing spectroscopic study. "It's driven by the science of enormous imaging surveys," says astronomer Richard Ellis of University College London. COVID-19 has delayed the SDSS's robotic makeover. The survey's northern telescope at Apache Point Observatory in New Mexico began to take SDSS-V data in October 2020 using plug plates. It aims to switch over to the robots by mid-2021. The southern scope at Las Campanas Observatory in Chile will follow later in the year. "It's bananas," says SDSS-V Director Juna Kollmeier of the Carnegie Observatories, "but we're seeing the end of the tunnel." The robots mark a new chapter for the SDSS.
For 10 years much of its time went to the study of dark energy, the mysterious force that is accelerating the universe's expansion. The SDSS prised apart the light of millions of galaxies to determine their distance, via a redshift, a Doppler shift in their light due to the expansion of the universe, like the wail of a receding siren. Results from the galaxy survey, released in July 2020, traced the universe's expansion back through 80% of its history with 1% precision, confirming the effects of dark energy, perhaps the biggest mystery in cosmology. Cracking it will require looking further back in time to fainter galaxies, which is beyond the capabilities of the survey's 2.5-meter telescopes. Instead, the scopes will carry out three new surveys. Milky Way Mapper will gather spectra from 6 million stars, probing their composition to find out how long they've been burning and forging heavy elements. "Stars are all clocks," Kollmeier explains. With age estimates, astronomers can work out when parts of the Milky Way formed. Subtle shifts in composition can also reveal whether a group of stars originated in another galaxy or star cluster that has been subsumed into ours, an unwinding of Milky Way history called galactic archaeology. In a second survey, Black Hole Mapper, the optical fibers will gather light from bright galaxies to learn about the supermassive black holes they harbor. Doppler shifts in the spectra of glowing gases surrounding these black holes could reveal how fast they fling this material around, and thus how heavy they are. Shifts in the spectra could trace how they gobble up and spit out streams of this gas. By tracking the gases over time, Kollmeier says, astronomers may learn how the black holes grow, seemingly in concert with their galaxies. The third survey, Local Volume Mapper, will bunch fibers together like a multipixel detector to get spectra from clouds of interstellar gas within nearby galaxies.
"We're mapping a whole galaxy in exquisite detail at one time," Kollmeier says. By determining the motions and composition of the gas clouds, the SDSS team hopes to identify why some collapse into stars and others don't. Meanwhile, the dark energy quest pioneered by the SDSS will move to the Dark Energy Spectroscopic Instrument, a 5000-fiber robotic spectrograph on a 4-meter telescope in Arizona. It will soon begin to track the distances to tens of millions of galaxies in the remote universe (Science, 13 September 2019, p. [1066][1]). In the coming months, the William Herschel Telescope, a 4.2-meter telescope in the Canary Islands, will join the robot revolution by sending light to a 1000-fiber spectrograph called the WHT Enhanced Area Velocity Explorer (WEAVE). Instead of using robots to hold fibers in place, WEAVE has two of them working offline, picking and placing magnetic fiber ends onto a metal plate, automating what the SDSS's plate pluggers did. One of WEAVE's goals is to gather Doppler shifts from the billion stars Gaia has mapped, nailing down their full 3D motions. Then, "We can run the clock backwards and see where they came from," says project scientist Scott Trager of the University of Groningen. It's another way to do galactic archaeology. Next year, the European Southern Observatory's (ESO's) 4-metre Multi-Object Spectroscopic Telescope in Chile will be fitted with yet another robotic technology. Its 2400 fibers will be fed through controllable "spines" that stick up into the telescope's focal plane and can be made to move, like wheat stalks in a breeze. Like WEAVE, it will follow up on sources identified by European spacecraft, including Gaia and Euclid, an upcoming dark energy mission. It and other fiber spectrographs will also help with studies of fast-moving cosmic events such as supernovae or the violent collisions that produce gravitational waves. The Rubin Observatory will spot many of them.
From 2023, it's expected to detect 10 million fast-changing objects every night. For the thousands that demand scrutiny, "spectra are really critical for understanding what a source is," says Eric Bellm of the University of Washington, Seattle, who is the science lead for Rubin's alert stream. Even some of the world's largest scopes, in the 8-meter range, are adding robotic spectrographs. Japan's Subaru and ESO's Very Large Telescope are both developing systems that will vacuum up spectra from faint, distant objects. Ellis says a fiber spectrograph combined with Subaru's 8.2-meter mirror would be able to pick out spectra of individual stars in the Andromeda galaxy, the Milky Way's nearby twin. "With a big telescope, we can do galactic archaeology in our nearest neighbor," he says. [1]: http://www.sciencemag.org/content/365/6458/1066
From Digitized Images to Online Catalogs Data Mining a Sky Survey
The value of scientific digital-image libraries seldom lies in the pixels of images. For large collections of images, such as those resulting from astronomy sky surveys, the typical useful product is an online database cataloging entries of interest. We focus on the automation of the cataloging effort of a major sky survey and the availability of digital libraries in general. The SKICAT system automates the reduction and analysis of the three terabytes worth of images, expected to contain on the order of 2 billion sky objects. For the primary scientific analysis of these data, it is necessary to detect, measure, and classify every sky object.
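To make the "detect and measure" step concrete, here is a minimal sketch of one common approach: threshold the image, group connected bright pixels into objects, and record a few measurements per object. This is a generic illustration, not SKICAT's actual pipeline. The function name `detect_objects`, the catalog fields, and the toy image are hypothetical, and the image is assumed to be background-subtracted with a positive threshold.

```python
from collections import deque


def detect_objects(image, threshold):
    """Label 4-connected groups of pixels brighter than `threshold`.

    Returns one dict per object with its pixel count, summed flux, and
    flux-weighted centroid (row, col). Assumes a positive threshold so
    that every detected object has positive total flux.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    catalog = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from this bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            npix, flux, sum_r, sum_c = 0, 0.0, 0.0, 0.0
            while queue:
                y, x = queue.popleft()
                value = image[y][x]
                npix += 1
                flux += value
                sum_r += value * y
                sum_c += value * x
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            catalog.append({"npix": npix, "flux": flux,
                            "row": sum_r / flux, "col": sum_c / flux})
    return catalog


# Toy 5x6 "image" with one extended source and one point-like source.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 9, 0, 0, 0],
    [0, 4, 6, 0, 0, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0, 0],
]
catalog = detect_objects(image, threshold=1)
```

The resulting per-object measurements (size, flux, position) are the kind of feature vector that a separate classifier could then use to label each entry, for example as star or galaxy.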
A Big Data Journey While Seeking to Catalog our Universe
It turns out, astronomers have lots of photos of the sky but seek knowledge about what the photos mean. Big data problems are often characterized as transforming data into insights, which is exactly what some ambitious scientists are working to do with "Sky Survey" data. A Sky Survey is essentially astronomer speak for "lots and lots of images taken by telescopes, along with information of when and where they were taken." The Celeste collaboration is a group of scientists who have worked to catalog the visible universe in a way never before accomplished. They seek to create and refine a catalog which can detail the placement and characteristics (such as brightness and rotation) of every visible object in the sky.
Galaxy is a 'Frankenstein'
A seemingly nondescript galaxy is actually a behemoth cobbled together out of various cosmic spare parts, Frankenstein-style, a new study suggests. UGC 1382, which lies about 250 million light-years from Earth, had long been regarded as an old, small and ordinary elliptical galaxy. But new observations show that it's actually a spiral, 718,000 light-years wide -- seven times larger than Earth's own Milky Way. "The center of UGC 1382 is actually younger than the spiral disk surrounding it," study co-author Mark Seibert, of the Observatories of the Carnegie Institution for Science in Pasadena, California, said in a statement. "This is like finding a tree whose inner growth rings are younger than the outer rings."